from helpers import create_database_engine, run_sql_and_return_df, run_sql_and_return_html, create_db_wrapper, execute_ddl_from_file
# Load these variables from .env file.
config_map = {
'user': "CMSC408_HW6_USER",
'password': "CMSC408_HW6_PASSWORD",
'host': "CMSC408_HW6_HOST",
'database': "CMSC408_HW6_DB_NAME"
}
cnx,config = create_db_wrapper( config_map )Homework 7 - More on joins
Last week, we worked with setting up a database. For this assignment, we are going to take that one step further and add in a few more tables to give us more practice working with join statements inside of SQL!
GITHUB URL: https://github.com/cmsc-vcu/cmsc408-sp2025-hw7-serrotrehpotsirhc.git
Overview and description
The database will represent a company-wide resume, capturing the collective skills and certifications of all the people in your company. The database will capture skills, employees and their roles within the company. The database will help our employer find employees with specific skills, list the skills of any employee and help find gaps in the skill sets of employees in specific roles.
Crows-foot diagram
This is the Crows Foot diagram for this database. People have skills. People can have many skills, and skills can each have many people. Both People and Skills are each One to Many toward each other, making both Many to Many.
erDiagram
people ||--o{ peopleskills : has
skills ||--o{ peopleskills : includes
people {
int id PK
string first_name
string last_name
string email
string linkedin_url
string headshot_url
string discord_handle
string brief_bio
date date_joined
}
skills {
int id PK
string name
string description
string tag
string url
string extra
}
peopleskills {
int id PK
int skills_id FK
int people_id FK
datetime date_acquired
}
Loading the database
ddl_file_name = "./my-ddl.sql"
messages,errors = execute_ddl_from_file( ddl_file_name, cnx)
if errors:
for error in errors:
print(f"{error}<br/>")
else:
print(f"No errors detected while loading: {ddl_file_name}")No errors detected while loading: ./my-ddl.sql
Examples of data in the database
The following sections provide an overview of the schema including table names, and number of rows and columns in each table.
For the people, skills, and roles tables, a description of each table is presented along with it’s contents.
Tables and metrics in the database
| table_name | row_count | column_count | |
|---|---|---|---|
| 0 | countries | 25 | 3 |
| 1 | departments | 27 | 4 |
| 2 | emp_details_view | 106 | 16 |
| 3 | employees | 107 | 11 |
| 4 | job_grades | 6 | 3 |
| 5 | job_history | 10 | 5 |
| 6 | jobs | 19 | 4 |
| 7 | locations | 23 | 6 |
| 8 | people | 10 | 9 |
| 9 | peopleroles | 15 | 4 |
| 10 | peopleskills | 26 | 4 |
| 11 | regions | 4 | 2 |
| 12 | roles | 6 | 3 |
| 13 | skills | 8 | 6 |
People table
The people table contains a SQL Select call to print the entire people table, using the *.
Below is a list of data in the people table.
run_sql_and_return_html( cnx,f"""
select * from people;
""" )| id | first_name | last_name | linkedin_url | headshot_url | discord_handle | brief_bio | date_joined | |
|---|---|---|---|---|---|---|---|---|
| 1 | Avery | Person 1 | avery1@example.com | https://linkedin.com/in/avery1 | https://example.com/headshots/avery.jpg | @avery01 | Fitness enthusiast and tech blogger. | 2025-01-01 |
| 2 | Jordan | Person 2 | jordan2@example.com | https://linkedin.com/in/jordan2 | https://example.com/headshots/jordan.jpg | @jordan02 | Mechanical engineer who loves extreme sports. | 2025-01-01 |
| 3 | Taylor | Person 3 | taylor3@example.com | https://linkedin.com/in/taylor3 | https://example.com/headshots/taylor.jpg | @taylor03 | Writer and hobbyist swimmer. | 2025-01-01 |
| 4 | Morgan | Person 4 | morgan4@example.com | https://linkedin.com/in/morgan4 | https://example.com/headshots/morgan.jpg | @morgan04 | Artist focused on sustainable designs. | 2025-01-01 |
| 5 | Riley | Person 5 | riley5@example.com | https://linkedin.com/in/riley5 | https://example.com/headshots/riley.jpg | @riley05 | Racing enthusiast and cliff diver. | 2025-01-01 |
| 6 | Skylar | Person 6 | skylar6@example.com | https://linkedin.com/in/skylar6 | https://example.com/headshots/skylar.jpg | @skylar06 | Tech consultant with a passion for flying. | 2025-01-01 |
| 7 | Casey | Person 7 | casey7@example.com | https://linkedin.com/in/casey7 | https://example.com/headshots/casey.jpg | @casey07 | Athlete and mural enthusiast. | 2025-01-01 |
| 8 | Quinn | Person 8 | quinn8@example.com | https://linkedin.com/in/quinn8 | https://example.com/headshots/quinn.jpg | @quinn08 | Entrepreneur and adventure lover. | 2025-01-01 |
| 9 | Reese | Person 9 | reese9@example.com | https://linkedin.com/in/reese9 | https://example.com/headshots/reese.jpg | @reese09 | Freelancer and swimmer. | 2025-01-01 |
| 10 | Parker | Person 10 | parker10@example.com | https://linkedin.com/in/parker10 | https://example.com/headshots/parker.jpg | @parker10 | Strength coach and motivational speaker. | 2025-01-01 |
| Total Rows: 10, Total Columns: 9 | ||||||||
Skills table
The skills table contains a SQL Select call to print the entire skills table, using the *.
Below is a list of data in the skills table.
run_sql_and_return_html( cnx,f"""
select * from skills;
""" )| id | name | description | tag | url | time_commitment |
|---|---|---|---|---|---|
| 1 | Rope Jumping | Jumping Rope | Skill 1 | www.google.com | NULL |
| 2 | Sky Diving | Jumping Rope | Skill 2 | www.google.com | NULL |
| 3 | Race Car Driving | Jumping Rope | Skill 3 | www.google.com | NULL |
| 4 | Heavy Lifting | Jumping Rope | Skill 4 | www.google.com | NULL |
| 5 | Swimmer | Jumping Rope | Skill 5 | www.google.com | NULL |
| 6 | Cliff Jumping | Jumping Rope | Skill 6 | www.google.com | NULL |
| 7 | Mural Artist | Jumping Rope | Skill 7 | www.google.com | NULL |
| 8 | Author | Jumping Rope | Skill 8 | www.google.com | NULL |
| Total Rows: 8, Total Columns: 6 | |||||
Roles table
The roles table contains the roles of each employee, which are defined in our MY-DL.SQl File. To call all of these roles for this python snippet, we wil use a select call with an * to print ALL the roles inside the table!
Below is a list of data in the roles table.
run_sql_and_return_html( cnx,f"""
select * from roles;
""" )| id | name | sort_priority |
|---|---|---|
| 1 | Designer | 10 |
| 2 | Developer | 20 |
| 3 | Recruit | 30 |
| 4 | Team Lead | 40 |
| 5 | Boss | 50 |
| 6 | Mentor | 60 |
| Total Rows: 6, Total Columns: 3 | ||
Sample queries
List skill names of Person 5
run_sql_and_return_html( cnx,f"""
SELECT s.name
FROM skills s
JOIN peopleskills ps ON s.id = ps.skills_id
WHERE ps.people_id = 5;
""" )| name |
|---|
| Race Car Driving |
| Cliff Jumping |
| Total Rows: 2, Total Columns: 1 |
List people with Skill 2
run_sql_and_return_html( cnx,f"""
SELECT p.first_name, p.last_name
FROM people p
JOIN peopleskills ps ON p.id = ps.people_id
WHERE ps.skills_id = 2;
""" )| first_name | last_name |
|---|---|
| Skylar | Person 6 |
| Reese | Person 9 |
| Total Rows: 2, Total Columns: 2 | |
List people with a DEVELOPER role
run_sql_and_return_html( cnx,f"""
SELECT CONCAT(p.first_name, ' ', p.last_name) AS person_name,
r.name AS role_name
FROM peopleroles pr
JOIN roles r ON pr.role_id = r.id
JOIN people p ON pr.people_id = p.id
WHERE r.name = 'Developer';
""" )| person_name | role_name |
|---|---|
| Avery Person 1 | Developer |
| Taylor Person 3 | Developer |
| Skylar Person 6 | Developer |
| Reese Person 9 | Developer |
| Parker Person 10 | Developer |
| Total Rows: 5, Total Columns: 2 | |
List names and email addresses of people without skills
run_sql_and_return_html( cnx,f"""
SELECT p.first_name, p.last_name, p.email
FROM people p
LEFT JOIN peopleskills ps ON p.id = ps.people_id
WHERE ps.skills_id IS NULL;
""" )| first_name | last_name | |
|---|---|---|
| Morgan | Person 4 | morgan4@example.com |
| Total Rows: 1, Total Columns: 3 | ||
List people names and skill names with the BOSS role
run_sql_and_return_html( cnx,f"""
SELECT CONCAT(p.first_name, ' ', p.last_name) AS person_name,
s.name AS skill_name,
r.name AS role_name
FROM peopleroles pr
JOIN roles r ON pr.role_id = r.id
JOIN people p ON pr.people_id = p.id
JOIN peopleskills ps ON ps.people_id = p.id
JOIN skills s ON s.id = ps.skills_id
WHERE r.name = 'Boss';
""" )| person_name | skill_name | role_name |
|---|---|---|
| Jordan Person 2 | Race Car Driving | Boss |
| Jordan Person 2 | Heavy Lifting | Boss |
| Jordan Person 2 | Swimmer | Boss |
| Total Rows: 3, Total Columns: 3 | ||
List ids and names of unused roles
run_sql_and_return_html( cnx,f"""
SELECT r.id, r.name
FROM roles r
LEFT JOIN peopleroles pr ON r.id = pr.role_id
WHERE pr.people_id IS NULL;
""" )| id | name |
|---|---|
| no records returned | |
| Total Rows: 1, Total Columns: 2 | |
List people and the number of skills they have acquired.
(Goal: Practice JOIN, GROUP BY, and COUNT())
run_sql_and_return_html( cnx,f"""
SELECT
p.id AS person_id,
CONCAT(p.first_name, ' ', p.last_name) AS person_name,
COUNT(ps.skills_id) AS skill_count
FROM
people p
LEFT JOIN peopleskills ps ON p.id = ps.people_id
GROUP BY
p.id, p.first_name, p.last_name
ORDER BY
skill_count DESC, person_name;
""" )| person_id | person_name | skill_count |
|---|---|---|
| 8 | Quinn Person 8 | 4 |
| 1 | Avery Person 1 | 3 |
| 7 | Casey Person 7 | 3 |
| 2 | Jordan Person 2 | 3 |
| 10 | Parker Person 10 | 3 |
| 9 | Reese Person 9 | 3 |
| 6 | Skylar Person 6 | 3 |
| 5 | Riley Person 5 | 2 |
| 3 | Taylor Person 3 | 2 |
| 4 | Morgan Person 4 | 0 |
| Total Rows: 10, Total Columns: 3 | ||
List each skill and the number of people who have acquired it.
(Goal: Practice JOIN, GROUP BY, COUNT())
run_sql_and_return_html( cnx,f"""
SELECT
skills.tag,
COUNT(peopleskills.skills_id)
FROM
skills
left join peopleskills
on (skills.id=peopleskills.skills_id)
GROUP BY
skills.tag
ORDER BY
COUNT(peopleskills.skills_id) DESC
""" )| tag | COUNT(peopleskills.skills_id) |
|---|---|
| Skill 3 | 6 |
| Skill 5 | 6 |
| Skill 6 | 5 |
| Skill 1 | 4 |
| Skill 4 | 3 |
| Skill 2 | 2 |
| Skill 7 | 0 |
| Skill 8 | 0 |
| Total Rows: 8, Total Columns: 2 | |
List all people who have acquired more than 2 skills
(Goal: Join through multiple tables and use HAVING)
run_sql_and_return_html( cnx,f"""
SELECT
p.id AS person_id,
CONCAT(p.first_name, ' ', p.last_name) AS person_name,
COUNT(ps.skills_id) AS skill_count
FROM
people p
JOIN peopleskills ps ON p.id = ps.people_id
GROUP BY
p.id, p.first_name, p.last_name
HAVING
COUNT(ps.skills_id) > 2
ORDER BY
skill_count DESC, person_name;
""" )| person_id | person_name | skill_count |
|---|---|---|
| 8 | Quinn Person 8 | 4 |
| 1 | Avery Person 1 | 3 |
| 7 | Casey Person 7 | 3 |
| 2 | Jordan Person 2 | 3 |
| 10 | Parker Person 10 | 3 |
| 9 | Reese Person 9 | 3 |
| 6 | Skylar Person 6 | 3 |
| Total Rows: 7, Total Columns: 3 | ||
Reflection
- What did you like most about this project?
- This project retained the architecture of Homework 6, so it was easy to get started, and I was less worried about trying to get everything working this time around. I wish more assignments were like this, instead of every assignment being daunting just to get up and running. By eliminating this step, I was actually able to focus on the work and apply concepts from class!
- What was most challenging about this project?
- Very little difficulty here. Logically working through how to define and populate the roles table, but that was straightforward from Homework 6. The python code in Report.QMD was difficult at first. It was more conceptualizing what tables I needed to join rather than simply implementing it!
- What would you do different next time?
- Not much! I love watching the lectures! I would probably run through a few more problem sets of this code. There should be a website graded for completion which generates SQL commands, where you can go through a bunch of implementation problems without worrying about your entire program not running or whatnot! This would isolate the skills of SQL without causing the stress of worrying that a mistake in code will be the reason the entire program doesn’t work! Over time, you would get better at SQL and be confident enough to work through implementation issues!
README
Below is the README from my project.
CMSC408 - Fall 2024 - Homework 7
Hello!
Last week, in Homework 6, I set up connection to the remote HR database, and started to work with tables defined and popoulated in the MY-DDL.SQL file from Homework 6! We used Select statements last week, to pull the tables we wanted to from the database using SQL commands! This week, we are going to repeat the basic information of Homework 6, while implementing addition SQL problem sets using JOIN commands to incorporate the newly formed Roles and Peopleroles tables inside this newly updated MY-DDL.SQL file in Homework 7!
We are going to be writing the SQL commands to manipulate the HR database through python code inside of:
Report.QMD
We are going to create and populate the Roles and Peopleroles tables using DDL inside of:
MY-DDL.SQL
Final Notes
Each of these files are inside the reports folder. Our credentials are still stored in the .env file. We need SQL, Quarto, Poetry, and Pandas installed, with the SQLTools extension to work through SQL commands. We will use Quarto to render the Report.QMD file into an HTML deliverable!
Thank you!